Cost-Sensitive Fault Remediation for Autonomic Computing
Abstract
We introduce a formal model of cost-sensitive fault remediation, derive an exact algorithm for solving the special case of deterministic observations, and demonstrate it on two example problems. This effort is part of two self-healing software projects that attempt to use collected data for better decision making in emerging autonomic systems.
Related resources
An Instance-Based State Representation for Network Repair
We describe a formal framework for diagnosis and repair problems that shares elements of the well-known partially observable MDP and cost-sensitive classification models. Our cost-sensitive fault remediation model is amenable to implementation as a reinforcement-learning system, and we describe an instance-based state representation that is compatible with learning and planning in this framework...
Injecting Robustness into Autonomic Grid Systems
Autonomic computational grids are self-organizing software systems that pool the computational resources of large public networks to solve computationally-intensive problems. While autonomic grids can scale to networks far larger than centralized grids, they have not seen the same adoption and success in industry due to an incomplete treatment of fault tolerance. In this paper, we propose two c...
Autonomic personal computing
Autonomic personal computing is personal computing on autonomic computing platforms. Its goals combine those of personal computing with those of autonomic computing. The challenge of personal autonomic computing is to simplify and enhance the end-user experience, delighting the user by anticipating his or her needs in the face of a complex, dynamic, and uncertain environment. In this paper we i...
Autonomic Fault Management for Wireless Mesh Networks
A Wireless Mesh Network (WMN) provides a cheaper option for backhauls that can be leveraged to provide low-cost access services. Compared to conventional wireless LANs, the benefits of a WMN include greater range because of packet relaying and higher throughput because of shorter hops. A WMN, however, may be subject to a variety of faults that are hard to diagnose manually. In this paper we discu...
Improving the palbimm scheduling algorithm for fault tolerance in cloud computing
Cloud computing is the latest technology that involves distributed computation over the Internet. It meets the needs of users through sharing resources and using virtual technology. The workflow user applications refer to a set of tasks to be processed within the cloud environment. Scheduling algorithms have a lot to do with the efficiency of cloud computing environments through selection of su...